Identify High-Quality Protein Structural Models by Enhanced K-Means

Authors

  • Hongjie Wu
  • Haiou Li
  • Min Jiang
  • Cheng Chen
  • Qiang Lv
  • Chuang Wu
Abstract

Background. A critical issue in protein three-dimensional structure prediction, whether by ab initio or comparative modeling, is identifying high-quality protein structural models among the generated decoys. Clustering algorithms are widely used to identify near-native models; however, their performance depends on the decoy set, and, for some algorithms, accuracy declines as the decoy population grows.

Results. Here, we propose two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first uses the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the second uses squared distance to optimize the initial centroids (K-means++). SK-means and K-means++ were more robust than SPICKER alone: for 33 (59%) and 42 (75%) of 56 targets, respectively, they detected models with template modeling scores better than or equal to those of SPICKER.

Conclusions. The classic K-means algorithm performed similarly to SPICKER, a widely used algorithm for protein-structure identification. Both SK-means and K-means++ showed substantial improvements over SPICKER and classic K-means.
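The two initialization strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure, so plain squared Euclidean distance on coordinate tuples stands in for a structural distance between decoys, and the function names `kmeans_pp_init` and `kmeans` are hypothetical. `kmeans_pp_init` follows the standard K-means++ seeding rule (each new centroid is drawn with probability proportional to its squared distance from the nearest centroid already chosen), and `kmeans` accepts any initial centroids, which is also where centroids produced by an external tool such as SPICKER would be passed in for an SK-means-style run.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng=None):
    """Pick k initial centroids using K-means++ seeding.

    Assumption: points are numeric tuples; a real decoy-clustering run
    would use a structural distance instead of Euclidean distance.
    """
    rng = rng or random.Random(0)
    centroids = [rng.choice(points)]
    # d2[i] holds the squared distance from point i to its nearest centroid.
    d2 = [squared_distance(p, centroids[0]) for p in points]
    while len(centroids) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            new = rng.choice(points)
        else:
            # Sample an index with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            new = points[idx]
        centroids.append(new)
        d2 = [min(old, squared_distance(p, new)) for old, p in zip(d2, points)]
    return centroids


def kmeans(points, initial_centroids, max_iter=100):
    """Basic K-means (Lloyd's algorithm) started from the given centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


if __name__ == "__main__":
    # Toy data: two well-separated groups of 2-D points (illustrative only).
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (100.0, 100.0), (100.0, 101.0), (101.0, 100.0)]
    seeds = kmeans_pp_init(data, k=2, rng=random.Random(42))
    final_centroids, final_clusters = kmeans(data, seeds)
    print(final_centroids)
```

Keeping seeding separate from the iteration loop means the same `kmeans` routine can be compared across starting points (random, K-means++, or externally supplied), which mirrors the comparison the abstract describes.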




Journal title:

Volume 2017, Issue

Pages -

Publication date 2017